Detecting memory mismatch between lockstep systems using a memory signature

ABSTRACT

Examples include a first computer system operating in lockstep with a second computer system. The first computer system includes a first signature generator to generate a first digital signature representing a first write operation by a first memory controller to a first memory, the first write operation to store data at an address in the first memory, and a first signature register to store the first digital signature. The second computer system includes a second signature generator to generate a second digital signature representing a second write operation by a second memory controller to a second memory, the second write operation to store the data at the address in the second memory, and a second signature register to store the second digital signature. The first digital signature is compared to the second digital signature and a lockstep error is detected when the first digital signature does not match the second digital signature.

TECHNICAL FIELD

Examples described herein are generally related to high reliability, multiple computer systems and more particularly to high reliability, multiple computer systems in which write data is processed (compared or copied) outside of checkpoint operations.

BACKGROUND

Some high reliability computer systems use a process known as checkpointing to keep a second computer system in software lockstep with a first computer system. Periodically, the first computer system is stopped and the Central Processing Unit (CPU) state and any changes to the first computer system's memory since the last checkpoint are compared to the second computer system. In the event of a failure or unrecoverable error on the first computer system, the second computer system will continue execution from the last checkpoint. Through frequent checkpointing, a second computer system can take over execution of a user's application with little noticeable impact to the user.

Memory controllers are included in computer CPUs to access a separate attached external system memory. In most high-performance computer systems, the CPU includes an internal cache memory to cache a portion of the system memory and uses the internal cache memory for the majority of all memory reads and writes. When the internal cache memory is full of changed data and the CPU desires to write additional changed data to the cache, the memory controller writes a copy of some of the cache content to external system memory.

High reliability computer systems use mirrored memory. A computer system may have memory configured to be in “mirror” mode. When memory is in mirrored mode, the memory controller, which is responsible for reading the contents of external memory to the CPU or writing data to the external memory from the CPU, writes two copies of the data to two different memory locations, a primary and secondary side of the mirror. When the memory controller is reading the data back into the CPU, the memory controller only needs to read one copy of the data from one memory location. If the data being read from the primary side has been corrupted and has uncorrectable errors in the data, the memory controller reads the mirror memory secondary location to get the other copy of the same data. As long as the memory controller is performing a read operation, the memory controller only needs to read from a single memory location. Whenever the memory controller is performing a write operation (transaction), the memory controller writes a copy of the data to the primary and secondary side of the mirror. The process of making two or more copies of data for enhanced reliability is referred to as mirroring and sometimes Redundant Array of Independent Disks (RAID 1). It is not necessary that the primary and secondary side of the mirror be on different physical memory devices.

FIG. 1 is a block diagram illustrating a prior art computer system with mirrored memory. Memory modules 100, 105, and 110 are the primary side of the memory in a computer system and memory modules 120, 125, and 130 are the secondary side of the memory. Other systems have a different number of memory modules. CPU 115 includes cores and cache memory 175 (as well as other components), a primary memory controller 135 coupled to the primary memory through interface 160, and a secondary memory controller 140 coupled to the secondary memory through interface 165. Different systems have different types and numbers of interfaces. Further, the primary and secondary memory controllers 135 and 140 could be two different memory controllers or two features of a single memory controller.

In mirroring, primary memory controller 135 and secondary memory controller 140 transfer the same data to the primary and secondary side of the memory so that the data is maintained in two copies in independent memory modules after each memory write operation. During a memory read operation 145, data is transferred from a memory module 100, 105, or 110 to primary memory controller 135. In the event the data is determined to be correct, no further actions are necessary to complete the read operation. In the event the data is determined to be corrupted, a read 170 may be performed by the secondary memory controller 140 from a memory module 120, 125, or 130 on the secondary side of the memory which contains a copy of the data stored on the primary side of the memory. This leads to higher reliability because even if data on the primary side of memory is corrupted, a copy may be read from the secondary side that is probably not corrupted.
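By way of illustration only, the mirrored write and fallback read described above can be sketched in software. The class name MirroredMemory, the dictionary-based storage, and the use of a CRC-32 checksum as a stand-in for the hardware error detection are all assumptions made for this sketch and are not taken from the figures.

```python
import zlib


class MirroredMemory:
    """Toy model of a mirrored memory (hypothetical; not the hardware design)."""

    def __init__(self):
        self.primary = {}    # address -> (data, checksum)
        self.secondary = {}  # mirror copy of the same entries

    def write(self, address, data):
        # Every write stores the same data on both sides of the mirror.
        entry = (bytes(data), zlib.crc32(data))
        self.primary[address] = entry
        self.secondary[address] = entry

    @staticmethod
    def _is_intact(entry):
        data, checksum = entry
        return zlib.crc32(data) == checksum

    def read(self, address):
        # A read normally touches only the primary side.
        entry = self.primary[address]
        if self._is_intact(entry):
            return entry[0]
        # Primary copy is corrupted: fall back to the secondary side.
        entry = self.secondary[address]
        if self._is_intact(entry):
            return entry[0]
        raise IOError("both mirror copies are corrupted")


mem = MirroredMemory()
mem.write(0x100, b"payload")
# Simulate corruption of the primary copy only.
mem.primary[0x100] = (b"pXyload", mem.primary[0x100][1])
recovered = mem.read(0x100)
```

In this sketch the read returns the intact secondary copy after the primary copy fails its check, which mirrors the read 170 fallback path.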

Checkpointing transfers and/or compares changed data between the first and the second computer systems. High reliability computers using checkpointing transfer data between the first computer system and the second computer system. An interface such as InfiniBand, PCI-Express (PCIe), or a proprietary interface between the computer systems is used to transfer the CPU state and the system memory content during the checkpointing process. The first computer system's CPU or Direct Memory Access (DMA) controller is usually used to transfer the contents of memory to the second computer system. Various methods are used to save time transferring the content of memory from the first computer system to the second computer system. For example, a memory paging mechanism may set a “Dirty Bit” to indicate that a page of memory has been modified. During checkpointing, only the pages of memory with the Dirty Bit set will be transferred. A page could be 4 Kilobytes, 2 Megabytes, 1 Gigabyte or some other size. The DMA device or processor copies the entire region of memory that has been identified by a Dirty Bit regardless of whether the entire page has been changed or only a few bytes of data in the page have changed.
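The Dirty Bit approach can be illustrated with a minimal software sketch. The class PagedMemory, the 4 Kilobyte page size, and the method names are assumptions chosen for illustration; a real paging mechanism sets the bit in hardware page tables.

```python
PAGE_SIZE = 4096  # assumed 4 Kilobyte page; other page sizes are possible


class PagedMemory:
    """Toy memory that marks a page dirty on every write (hypothetical)."""

    def __init__(self, num_pages):
        self.data = bytearray(num_pages * PAGE_SIZE)
        self.dirty = set()  # page numbers modified since the last checkpoint

    def write(self, address, payload):
        self.data[address:address + len(payload)] = payload
        first_page = address // PAGE_SIZE
        last_page = (address + len(payload) - 1) // PAGE_SIZE
        self.dirty.update(range(first_page, last_page + 1))

    def checkpoint_to(self, other):
        """Copy each dirty page in full to `other`; return bytes transferred."""
        transferred = 0
        for page in sorted(self.dirty):
            start = page * PAGE_SIZE
            other.data[start:start + PAGE_SIZE] = self.data[start:start + PAGE_SIZE]
            transferred += PAGE_SIZE
        self.dirty.clear()
        return transferred


first = PagedMemory(num_pages=8)
second = PagedMemory(num_pages=8)
first.write(5000, b"abc")             # three bytes changed within page 1
copied = first.checkpoint_to(second)  # the whole page is still copied
```

Here a three-byte change causes a full page to be transferred, which is the inefficiency noted in the preceding paragraph.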

Checkpointing reduces a computer system's performance. While the computer system is performing the checkpointing task, the computer system generally is not doing useful work for the user, so the user experiences reduced performance. There is always a tradeoff between frequency of checkpointing intervals, complexity of the method to efficiently transfer checkpoint data, and latency delays that the user experiences. Minimum latency can be realized by only transferring the data that has been changed in the computer memory.

Checkpointing may be used when both a first computer system and a second computer system are executing the same instructions. When both computer systems are executing the same code at the same time, they may be periodically stopped and the contents of the CPU registers and memory contents compared with each other. If the computer systems have identical CPU register values and memory contents, they are allowed to continue processing. When both computer systems are comparing memory and register values, a low latency comparison exists when only the data that has been changed is compared between the two systems. Various methods have been used in the prior art to reduce the amount of time necessary to copy the contents of external memory to the second computer system.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram representation of a prior art high reliability computer using memory mirroring.

FIG. 2 is a block diagram representation of a high reliability dual computer system according to some embodiments of the invention.

FIG. 3 is a block diagram representation of a high reliability dual computer system according to some embodiments of the invention.

FIG. 4 is a block diagram representation of further details of FIG. 3 according to some embodiments of the invention.

FIG. 5 is a block diagram representation of a high reliability dual computer system according to some embodiments of the invention.

FIG. 6 is a block diagram representation of a high reliability dual computer system according to some embodiments of the invention.

FIG. 7 is a block diagram representation of a high reliability dual computer system according to some embodiments of the invention.

FIG. 8 is a block diagram representation of further details of FIG. 2 according to some embodiments of the invention.

FIG. 9 is a block diagram representation of further details of FIG. 2 according to some embodiments of the invention.

FIG. 10 is a block diagram representation of a first computer system according to an embodiment of the invention.

FIG. 11 is a block diagram representation of a first computer system according to another embodiment of the invention.

FIG. 12 is a block diagram representation of a first computer system and a second computer system according to an embodiment of the invention.

FIG. 13 is a block diagram representation of a first computer system and a second computer system according to another embodiment of the invention.

FIG. 14 is a flow diagram of processing by a first computer system and a second computer system according to an embodiment of the invention.

DETAILED DESCRIPTION

This disclosure relates to high reliability computer architectures. Specifically, this disclosure describes a low latency method of checkpointing to keep two computers in lockstep. In some embodiments (online, offline mode), the checkpointing operation can be performed faster because data is transferred during normal operation and does not need to be transferred during the checkpoint operation. In other embodiments (software lockstep mode), data does not need to be compared during the checkpoint operation because the data is compared during normal operation.

Memory controllers typically write only changed or new data to main memory (external memory modules), and when the computer system is using mirrored memory, the memory controller writes a duplicate copy of the new or changed data to both the primary and the secondary side of the mirror. By modifying the memory controller or the memory device to transfer data to a second computer system while writing the data to memory, checkpointing overhead is reduced or eliminated for the memory copy portion of checkpointing.

In some embodiments, a form of checkpointing (offline checkpointing) is used in which a first computer system (e.g., an online system) runs a user's application and periodically stops to copy internal and external data and the CPU state to a second computer (e.g., an offline system). The need to transfer memory contents during the checkpoint operation is reduced or eliminated by transferring data from the online system to the offline system during each memory write operation (transaction) while the first computer system is running the user's application.

In other embodiments, another form of checkpointing is used in which both a first and a second computer system are running a user's application concurrently (software lockstep mode). Periodically, both computer systems are stopped at the same time and point in an application. One system may be slightly ahead or behind the other system, so the system that is behind is allowed to run additional instructions until the two systems are stopped on the same instruction. Then the internal and external memory and CPU state are compared. Some embodiments reduce the need to compare external memory contents during the checkpoint operation by performing the external memory compare every time data is written to memory. Some embodiments only support software lockstep mode and other embodiments only support online, offline mode. Still other embodiments support both software lockstep mode and online, offline mode.

FIG. 2 is a block diagram illustrating some embodiments of a low overhead checkpointing system. FIG. 2 may be used to implement either form of checkpointing (software lockstep or online, offline modes) and variations of them described below.

In FIG. 2, primary system 200 includes CPU1 204, memory modules 100, 105, and 110 on the primary memory side, and memory modules 208, 125, and 130 on the secondary memory side. CPU1 includes cores and cache memory 282 (which may be the same as or different than cores and cache memory 175), primary memory controller 212 and secondary memory controller 214, as well as various other components. Primary and secondary memory controllers 212 and 214 may be on the same die as CPU cores and cache memories 282 or on a different die. Primary and secondary memory controllers 212 and 214 may be separate memory controllers or two features of the same memory controller. CPU1 204, primary and secondary memory controllers 212 and 214 may be the same as or different than CPU 115, primary and secondary memory controllers 135 and 140 in FIG. 1.

Secondary system 202 includes CPU2 238, memory modules 232, 234, and 236 on the primary memory side, and memory modules 240, 242, and 244 on the secondary memory side. CPU2 includes CPU cores and cache memories 284 (which may be the same as or different than cores and cache 282), primary memory controller 252 and secondary memory controller 254 and other components. Memory module 208 includes memory devices and inter-memory transfer interface 228, and memory module 240 includes memory devices and inter-memory transfer interface 258.

In some embodiments, primary memory controller 212 and secondary memory controller 214 transfer the same data to the primary and secondary side of the memory so that the data is maintained in two copies in independent memory modules during each memory write operation.

There are different ways in which memory write operations may be performed in different embodiments. FIGS. 8 and 9 illustrate some of these. Referring to FIGS. 2 and 8, during a memory write operation, CPU1 204 transfers data by writing 155 to a memory module 100, 105, or 110 on the primary side of the memory using memory interconnect 160. Concurrently with the write 155 to the primary side of the memory, CPU1 204 transfers data by writing 226 to inter-memory transfer interface 228 in memory module 208 on the secondary side using memory interconnect 165. Data is transferred 230 to memory in memory module 208, 125, or 130. During the write 226 process, inter-memory transfer interface 228 on memory module 208 signals secondary system 202 with information about the write using private interface 280 (which is an example of an interconnect). Secondary system inter-memory transfer interface 258 receives the information about the write over private interface 280. The inter-memory transfer interface 258 on secondary system secondary side memory module 240 performs a write 262 to memory in secondary side memory modules 240, 242, or 244. Note that memory controller 254 may be the same as or different than memory controller 214. Likewise, memory controllers 212 and 214 may be the same as each other or different and memory controllers 252 and 254 may be the same as each other or different (and may be separate memory controllers or two features of the same memory controller). The inter-memory transfer interfaces may be, for example, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), or integrated into Dynamic Random-Access Memory (DRAM) devices.

In some embodiments for online, offline mode, secondary memory controller 254 in system 202 receives information 256 from inter-memory transfer interface 258 and causes CPU2 238 to write the same data to the primary side memory modules 232, 234, or 236 using primary memory controller 252. Upon completion of the writes 155, 226, 230, 262, and 248, the memory contents of the secondary system will be the same as the memory contents of the primary system. During the next offline checkpointing event, in some embodiments, there will be no need to transfer memory content or compare memory content because every write operation on the primary system has been repeated on the secondary system.

In some embodiments for online, offline mode, the secondary system inter-memory transfer interface 258 does not cause the data to be written to the primary side of the mirror so that the primary side contains the memory image of the last checkpoint operation. Write information provided over interface 280 is written to memory modules 240, 242, or 244 but is not transferred by CPU2 238 to the secondary system, primary memory. As the primary system runs, there is a possibility that there will be incorrect data written to the memory. If incorrect data is written to both sides of the mirrored memory on the primary system 200, and a copy of the bad data is written to the secondary system 202, there is a correct copy of data on the primary side of the mirror on the secondary system 202. To recover data or the operation during a checkpoint operation, the data from the previous checkpoint operation may be read from the secondary system 202 primary memory controller 252. In some embodiments, when data is only written to the secondary memory, during checkpointing the changed data on the secondary side of the mirror can be transferred to the primary side, thus preserving the previous checkpointed data on the primary side until it is safe to update with the changed data on the other side.
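As a rough software analogy for this staging behavior, the sketch below keeps forwarded writes on the secondary side only and copies them to the primary side at a checkpoint. The class StagedMirror, its method names, and the explicit rollback step are assumptions for illustration and are not elements of the figures.

```python
class StagedMirror:
    """Toy model of the secondary system's mirror (hypothetical)."""

    def __init__(self):
        self.primary = {}     # memory image of the last checkpoint
        self.secondary = {}   # last checkpoint plus forwarded writes
        self.pending = set()  # addresses changed since the last checkpoint

    def forwarded_write(self, address, data):
        # Writes received over the private interface touch only the secondary side.
        self.secondary[address] = data
        self.pending.add(address)

    def commit_checkpoint(self):
        # At a good checkpoint, changed data moves to the primary side.
        for address in self.pending:
            self.primary[address] = self.secondary[address]
        self.pending.clear()

    def rollback(self):
        # On an error, restore the secondary side from the preserved image.
        for address in self.pending:
            if address in self.primary:
                self.secondary[address] = self.primary[address]
            else:
                del self.secondary[address]
        self.pending.clear()


mirror = StagedMirror()
mirror.forwarded_write(0x10, b"v1")
mirror.commit_checkpoint()
mirror.forwarded_write(0x10, b"bad")       # incorrect data arrives
preserved = mirror.primary[0x10]           # primary side is still b"v1"
mirror.rollback()
```

The design choice shown is that the primary side is only ever updated at a checkpoint, so it always holds a known-good image.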

In some embodiments using the software lockstep mode, primary computer system 200 and secondary computer system 202 execute the same user program and run in software lockstep. Each computer system executes the same instructions at almost the exact same time. When the primary computer system 200 and the secondary computer system 202 write data to the primary system, secondary memory (in module 208, 125 or 130) and the secondary system, secondary memory (in module 240, 242, or 244), inter-memory transfer interface 228 and the inter-memory transfer interface 258 may compare the write information from transactions 226 and 256 when the write operations occur. During the next software lockstep checkpoint operation, memory contents do not need to be compared because every write occurring in the first system is compared to every write occurring in the second system concurrently with the writes by the inter-memory transfer interfaces 228 or 258 or both 228 and 258. The comparison of information related to write operations may be of the entire provided write information or merely a portion of it. Accordingly, at least some of the information is compared.

Referring again to FIGS. 2 and 8, in FIG. 8, writes pass through inter-memory transfer interfaces 228 and 258 before passing to memory 810 and 820 or other memory in modules 125, 130, 242, or 244 on interfaces 800 or 805. When information is received over private interface 280, a write to memory 230 or 262 can occur without using connection 165 or 260. Likewise, read data from modules 125, 130, 242, or 244 passes through interfaces 228 or 258 before being passed on interface 165 or 260. By contrast, in FIG. 9, data can be written to or read from memory 810, 904, or memory in modules 125, 130, 242, or 244 without passing through inter-memory transfer interfaces 902 or 904. Note that interfaces 902 and 904 may be the same as or different than interfaces 228 and 258.

FIG. 3 illustrates some alternative embodiments. Referring to FIG. 3, primary system 300 and secondary system 302 are like systems 200 and 202 of FIG. 2 except that inter-memory interfaces 228 and 258 are not included in FIG. 8, and memory controllers 214 and 254 of FIG. 2 are replaced with data transfer interfaces 316 and 352 in FIG. 3. Further, private interface 280 is replaced with private interface 330 (which is an example of an interconnect) in the system of FIG. 3. Also, in FIG. 8, modules 120 and 230 replace modules 208 and 240 of FIG. 2. (Note that although the modules are labeled Dual Inline Memory Modules (DIMMs), they do not have to be DIMMs.)

In online, offline mode, during a memory write operation, CPU 304 transfers data by writing 155 to a memory module 100, 105, or 110 on the primary side of the memory using memory interconnect 160. Concurrently with the write 155 to the primary side of the memory, data transfer interface 316 transfers data by writing 150 to a memory module 120, 125, or 130 on the secondary side of the memory using memory interconnect 165. During the write 150 process, data transfer interface 316 signals secondary system 302 with information about the write using private interface 330. Secondary system data transfer interface 352 receives the information about the write from private interface 330. The data transfer interface 352 on secondary system CPU2 338 performs a write 366 to secondary side memory device 360, 242, or 244 and in some embodiments causes primary memory controller 252 to write (248) the same information to the primary memory in module 232, 234, or 236.

In some embodiments of online, offline mode, secondary system data transfer interface 352 transfers the information about the write from private interface 330 to the primary memory in module 232, 234, or 236 and secondary memory in module 360, 242, or 244 so that the data is maintained in two copies in independent memory modules during each memory write operation.

In some embodiments of online, offline mode, secondary system data transfer interface 352 transfers the signaled data from private interface 330 to only the secondary side of the memory (modules 360, 242, and 244), preserving the contents of the primary side of the memory until the checkpointing process allows the changed data to be written to the primary side of the memory.

In some embodiments of the software lockstep mode, primary system 300 and secondary system 302 are running the same user application concurrently in software lockstep. When the two systems perform write operations (155, 150, 248, and 366) to primary and secondary memory, the primary system data transfer interface 316 and/or secondary system data transfer interface 352 compare information about write operations using information provided over private interface 330. During a software lockstep checkpoint operation, the contents of memory may not need to be compared because during each write operation while the primary and secondary systems are running, the write data is compared.

FIG. 4 provides additional detail of some embodiments of FIG. 3. Data transfer interface 316 includes a memory controller 405 and an inter-computer transfer interface 410. Data transfer interface 352 includes a second inter-computer transfer interface 415 and a memory controller 420. First inter-computer transfer interface 410 detects when a write occurs from CPU 304 over interface 400 to memory controller 405. Information about the write, such as the data being written, the address in memory it is being written to, and, optionally, the time that the data write occurred, is transferred to the second inter-computer transfer interface 415 using a private interface 330.

In some embodiments in online, offline mode, when interface 415 receives information about a data write from interface 410, interface 415 causes the second memory controller 420 to write a copy of the data from interface 410 to the second system memory attached to memory interface 260.

In some embodiments when systems 300 and 302 are operating in software lockstep, interface 410 detects when CPU 304 writes to memory controller 405. Information about the write, such as the data being written, the address in memory it is being written to, and, optionally, the time that the data write occurred, is transferred by interface 410 to interface 415 using private interface 330. Interface 415 detects when CPU 338 writes over interface 425 to memory controller 420. Information about the write, such as the data being written, the address in memory it is being written to, and, optionally, the time that the data write occurred, is compared to the information signaled from interface 410. If the data is the same, the memory does not need to be compared during the next software lockstep checkpoint because all of the changed values were compared when written to memory, thus reducing the time needed to perform software lockstep checkpointing. The comparison can be performed in interface 410 or in 415 or in both 410 and 415. In alternative embodiments, the comparison could be performed in other circuitry of the system outside the interfaces. For example, the comparison could be performed in the cores, the memory controller, or other circuitry of the CPUs.
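The comparison of write information can be sketched as follows. The record type WriteInfo, the function writes_match, and the max_skew tolerance on the optional time field are illustrative assumptions; the description above does not specify how the time values are compared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WriteInfo:
    """Information about one write (hypothetical record layout)."""
    address: int
    data: bytes
    time: Optional[int] = None  # optional time of the write, arbitrary units


def writes_match(local, remote, max_skew=None):
    """Return True if two writes carry the same address and data.

    If max_skew is given and both writes carry a time, the times must
    also agree to within max_skew (an assumed policy for this sketch).
    """
    if local.address != remote.address or local.data != remote.data:
        return False
    if max_skew is not None and local.time is not None and remote.time is not None:
        return abs(local.time - remote.time) <= max_skew
    return True


from_first = WriteInfo(address=0x2000, data=b"\x12\x34", time=1000)
from_second = WriteInfo(address=0x2000, data=b"\x12\x34", time=1004)
same = writes_match(from_first, from_second, max_skew=10)
diverged = writes_match(from_first, WriteInfo(0x2000, b"\x12\x35", 1004))
```

In this sketch a mismatch in either address or data is treated as a divergence, while the time field is consulted only when a tolerance is supplied.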

FIG. 5 illustrates other embodiments. FIG. 5 is similar to FIG. 3 except that systems 500 and 502 do not include memory modules connected to the data transfer interfaces 316 and 352. Accordingly, there will be writes to the primary side of the second system 502 in online, offline mode.

FIG. 6 illustrates other embodiments. FIG. 6 is similar to FIG. 5 except that in systems 600 and 602, the data transfer interfaces 316 and 352 replace memory controllers 212 and 252. In still other embodiments, systems like those in FIG. 3 could have data transfer interfaces 316 and 352 on the primary side and memory controllers 212 and 252 on the secondary side.

FIG. 7 illustrates other embodiments. FIG. 7 is similar to FIG. 2, except that modules 208 and 240 are on the primary side and there is no secondary side. In some embodiments, there could also be a secondary side. In other words, in FIG. 2, modules 208 and 240 could be swapped with modules 110 and 236 with private interface 280 being moved as well.

Some embodiments of the present invention comprise an unobtrusive sideband check on memory writes for lockstep systems (as described above) that provides an earlier indication that a CPU is no longer in lockstep with another CPU than by means of stopping and comparing the state of both CPUs. When using these embodiments, the user may have more confidence in the stability of the lockstep systems and use that additional confidence to relax the frequency of checkpoint activity, thereby increasing perceived performance of the lockstep systems. These embodiments provide advantages over the embodiments described above, because those embodiments for hardware and software lockstep solutions required saving the CPU context and comparing memory locations as well as CPU registers individually, which slows down the checkpointing step. Also, until a checkpoint occurs, the user is not aware if a CPU has broken out of lockstep.

In response, in embodiments described below a signature generator is provided that creates a unique digital signature for address and data information being written to main memory from a CPU 204, 238. This information is generated on both systems 200, 202 in a high reliability server running in lockstep. The signatures are compared in real-time to ensure that memory writes on both systems contain the same information and are sent to the same address at nearly the same time. This provides an advantage that CPUs falling out of lockstep are detected sooner than in other methods. Thus, checkpoint operations may be scheduled with longer delays between them since this embodiment provides earlier detection of issues.

In one embodiment, the digital signature generated by the signature generator is a hash of at least the address and data. In another embodiment, the digital signature generated by the signature generator is a cyclic redundancy check (CRC) value computed over at least the address and the data. In another embodiment, other information, such as a time stamp, counter value or nonce, for example, may also be included in the digital signature. In other embodiments, other digital signatures may be used depending on implementation choices. Any suitable hash or CRC computation known in the art may be used for digital signature generation.
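For illustration, both signature variants can be sketched with standard library routines. The choice of CRC-32 and SHA-256, the 64-bit big-endian address encoding, and the function names are assumptions for this sketch; the description leaves the specific hash or CRC computation open.

```python
import hashlib
import struct
import zlib


def pack_write(address, data):
    # Assumed encoding: 64-bit big-endian address followed by the raw data bytes.
    return struct.pack(">Q", address) + bytes(data)


def crc_signature(address, data):
    """CRC variant: CRC-32 over the address and the data."""
    return zlib.crc32(pack_write(address, data))


def hash_signature(address, data):
    """Hash variant: SHA-256 over the address and the data."""
    return hashlib.sha256(pack_write(address, data)).hexdigest()


sig_a = crc_signature(0x1000, b"\x01\x02")
sig_b = crc_signature(0x1000, b"\x01\x02")  # same write on the other system
sig_c = crc_signature(0x1008, b"\x01\x02")  # same data, different address
```

Because the address is included in the signed bytes, identical data written to a different address produces a different signature in this sketch.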

As described above, hardware lockstep uses two identical computer systems running in clock by clock “lockstep” so that each computer system is executing the same instruction at approximately the same time, and both systems should have identical content in their memories. The two computer systems appear to be a single system to the user. Software lockstep also uses two systems, one running as the primary and the other running as the secondary. The secondary system monitors the primary system to ensure that it continues to operate. Both systems contain the same memory image, which allows either one to continue to operate if one system fails. However, in order to stay in lockstep, the primary and secondary system periodically halt; one system will execute enough additional instructions to “catch up” to the other, then they compare internal states and memory contents (changes to memory since the last comparison) in order to verify that both systems are operating correctly.

Having a redundant set of computer systems provides a mechanism for maintaining highly reliable operations. If one system has a failure, the other system continues to run the user's software (providing the high reliability) and allows the failing computer/component to be reset or replaced while still running the user's software. Once ready to continue operation, the system is brought back into lockstep (hardware or software) and highly reliable operation can continue. While only one computer system is in operation, the system is subject to a second failure and is not in a highly reliable condition.

To recover from a failure, a system management interrupt (SMI) is generated on the running system, and during the interrupt routine, the running system copies the current state of the operating processor and the entire memory contents of the operating processor to the recovering or second system. After copying the contents, including on-CPU cache and CPU registers, both systems will contain identical data. Once both CPUs and memory contents match, a resume from SMI can be executed simultaneously causing the two systems to run in a high reliability state.

A feature of a hardware lockstep system is the detection of when the two computer systems fall out of lock (lockstep error). A feature of a software lockstep system is ensuring an exact copy of memory exists (in some scenarios, a software lockstep system detects a failure by a simple “are you there” exchange).

Some embodiments provide a memory write access digital signature mechanism for computer systems running in a high reliability mode that 1) detects a hardware out of lock condition (e.g., a lockstep error), 2) minimizes the amount of memory that needs to be copied to get back into lockstep (to minimize the amount of time the system is not running in lockstep), 3) allows two systems to quickly determine whether the memory contents are the same by comparing the digital signatures, and 4) allows the amount of slip between two systems to be measured, without having to stop them and compare program counters, by comparing the difference in time for when the digital signatures are created.
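The slip measurement can be illustrated with a short sketch. It assumes, purely for illustration, that each system keeps a log of (signature, creation time) pairs in write order and that times are expressed in a common unit; the function name measure_slip and the log format are hypothetical.

```python
def measure_slip(first_log, second_log):
    """Return per-write time differences between two signature logs.

    Each log is a list of (signature, creation_time) pairs in write order
    (an assumed format). A positive value means the second system created
    the matching signature later than the first system.
    """
    slips = []
    for (sig1, t1), (sig2, t2) in zip(first_log, second_log):
        if sig1 != sig2:
            # Differing signatures indicate a lockstep error, not slip.
            raise ValueError("signature mismatch: lockstep error")
        slips.append(t2 - t1)
    return slips


first_log = [(0xA1B2, 100), (0xC3D4, 200), (0xE5F6, 300)]
second_log = [(0xA1B2, 103), (0xC3D4, 201), (0xE5F6, 305)]
slips = measure_slip(first_log, second_log)
worst = max(abs(s) for s in slips)
```

Neither system is stopped in this sketch; the slip is derived solely from the recorded creation times of matching signatures.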

FIG. 10 is a block diagram representation of a first computer system 1000 according to an embodiment of the invention. First computer system 1000 includes CPU 1 1002 having CPU cores and cache memory 1004 and memory controller 1006. Memory controller 1006 is coupled to memory 1008. Embodiments of the present invention require a memory space, denoted herein as one or more signature registers 1012, to store one or more digital signatures, each digital signature representing performance of a write operation of the address and data being written out of CPU 1 1002 by memory controller 1006 into memory 1008. In an embodiment, there are one or more signature registers 1012, such as sig 1 1014, sig 2 1016, sig 3 1018, . . . sig N 1020, where N is a natural number. In an embodiment, signature registers may be used as a first-in, first-out (FIFO) queue (e.g., the oldest signatures in the signature registers are replaced with newer signatures). In an embodiment, CPU 1 1002 includes signature generator 1010 to generate a hash value or CRC based at least in part on the address and data (and optionally the time that the write occurred) for a write operation. In an embodiment, separate sets of signature registers 1012 may be associated with each memory controller, if the CPU includes multiple memory controllers. A hash or CRC can be used with high confidence to detect differences in memory content because the hash or CRC has the characteristic that if a single bit is different between two sets of (address and) memory contents, the hash or CRC will be different. It is unlikely that a multiple bit error in the data and address would be created such that the hash or CRC codes would match. As used herein, “hash” or “CRC” may be used alone or in combination without limitation, and they may be substituted for each other.

In an embodiment, signature registers 1012 are cleared by a power on reset or by a model specific register (MSR) write to the signature registers. In various embodiments, signature registers may be global for an entire system, pertain to a single CPU (such as CPU 1 1002), single memory controller 1006, single or multiple memory channels, multiple CPUs, or multiple memory controllers.

During operation, writes from CPU cores 1004 to internal memory are captured in the CPU's cache. When more writes occur than the cache can hold, an external memory write may be used to take the least frequently used cache entry and write the contents to memory 1008. During this write to memory 1008, signature generator 1010 creates a hash or CRC value for at least the address and content of the memory being written, thereby creating a unique digital signature for that combination of address and data. A second computer system running in lockstep will be executing the same instructions at the same time and will therefore generate the same hash or CRC value simultaneously during the write to memory. In this scenario, immediate detection of a second computer system falling out of lockstep with a first computer system is provided.

FIG. 11 is a block diagram representation of a first computer system 1100 according to another embodiment of the invention. First computer system 1100 is similar to first computer system 1000 of FIG. 10, but in this embodiment signature generator 1010 and signature registers 1012 are not located on CPU 1 1002 (e.g., not on the processor die). In this embodiment signature generator 1010 and signature registers 1012 are located on another component 1102 of first computer system 1100 accessible by CPU 1 1002.

FIG. 12 is a block diagram representation of a first computer system 1000 and a second computer system 1200 according to an embodiment of the invention. In this embodiment, first computer system 1000 and second computer system 1200 are operating in lockstep. First computer system 1000 includes CPU 1 1002 having CPU cores and cache memory 1004, memory controller 1006, signature generator 1010, signature registers 1012, and memory 1008 as shown in FIG. 10. Similarly, second computer system 1200 includes CPU 2 1202 having CPU cores and cache memory 1204, memory controller 1206, signature generator 1210, signature registers 1212, and memory 1208. In this embodiment, when a digital signature is generated by signature generator 1010 (as a result of performance of a write operation to memory 1008) and stored in signature registers 1012 on CPU 1 1002, the digital signature is also sent (represented by line 1214) to CPU 2 1202 (e.g., to CPU cores 1204). CPU 2 1202, operating in lockstep, generates a digital signature for the matching write operation to memory 1208. CPU 2 1202 then compares the digital signature received from CPU 1 1002 with the digital signature generated by CPU 2 1202 and detects if a lockstep error has occurred. In an embodiment, optionally the time that the digital signatures were generated is also compared to determine if a lockstep error has occurred. In this embodiment, the comparison is performed by hardware in CPU 2 1202 or by software instructions being executed by CPU 2 1202.

FIG. 13 is a block diagram representation of a first computer system 1000 and a second computer system 1200 according to another embodiment of the invention. In this embodiment, signature registers 1312 are commonly used by first computer system 1000 and second computer system 1200 and are not integral with CPU 1 1002 or CPU 2 1202 (e.g., not on CPU 1's die or CPU 2's die). Signature registers 1312 and comparator circuitry 1314 are instead in a separate component 1316 of overall computing system 1300. Each of CPU 1 1002 and CPU 2 1202 stores digital signatures in commonly used signature registers 1312 in component 1316. Comparator circuitry 1314 in component 1316 compares a digital signature written by CPU 1 1002 to an associated (e.g., by lockstep) digital signature written by CPU 2 1202 and indicates a lockstep error if the digital signatures do not match.

FIG. 14 is a flow diagram 1400 of processing by a first computer system 1000 and a second computer system 1200 according to an embodiment of the invention. The actions of FIG. 14 may be performed by the embodiment shown in FIG. 12. On CPU 1 1002, at block 1402 memory controller 1006 receives a request to write to memory 1008. At block 1406, signature generator 1010 generates a digital signature representing the write operation. Signature generator 1010 stores the signature in one of the signature registers 1012 at block 1410. Substantially in parallel, on CPU 2 1202 at block 1404 memory controller 1206 receives a request to write to memory 1208. At block 1408, signature generator 1210 generates a digital signature representing the write operation. Signature generator 1210 stores the signature in one of the signature registers 1212 at block 1412. At block 1414, CPU 1 1002 sends the signature stored in signature register 1012 to CPU 2 1202. At block 1416, CPU 2 1202 receives the signature from CPU 1. At block 1418, CPU 2 1202 compares the signature received from CPU 1 to the signature generated by signature generator 1210 on CPU 2 1202. If the signatures match (and optionally the time when the signatures were generated match), the first and second computer systems are in lockstep operation and processing continues with blocks 1402 and 1404, by CPU 1 and CPU 2, respectively. If the signatures do not match, at block 1420, CPU 2 1202 initiates actions to correct the error. In an embodiment these actions include restarting the CPUs at the last known valid location (e.g., where the CPUs were known to be in lockstep) in the program being executed by the CPUs.
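
The flow of FIG. 14 can be approximated in software as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: each system's memory writes are modeled as a list of (address, data) pairs, the signature is a CRC-32 as one of the options named above, and the function names signature and first_lockstep_error are hypothetical.

```python
import struct
import zlib


def signature(address: int, data: bytes) -> int:
    """CRC-32 over a 64-bit little-endian address followed by the data (assumed encoding)."""
    return zlib.crc32(struct.pack("<Q", address) + data)


def first_lockstep_error(writes_cpu1, writes_cpu2):
    """Return the index of the first write whose signatures differ, or None.

    Each argument is a sequence of (address, data) write requests,
    one sequence per computer system.
    """
    for index, (w1, w2) in enumerate(zip(writes_cpu1, writes_cpu2)):
        sig1 = signature(*w1)  # blocks 1402, 1406, 1410 on CPU 1
        sig2 = signature(*w2)  # blocks 1404, 1408, 1412 on CPU 2
        # Blocks 1414-1418: CPU 1's signature is sent to CPU 2 and compared.
        if sig1 != sig2:
            return index  # block 1420: corrective action would start here
    return None


in_step = [(0x1000, b"aa"), (0x2000, b"bb")]
diverged = [(0x1000, b"aa"), (0x2000, b"bc")]

matching_result = first_lockstep_error(in_step, in_step)
error_result = first_lockstep_error(in_step, diverged)
```

Here matching_result is None because every pair of signatures agrees, while error_result is 1 because the second write differs in its data.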

In another embodiment, the digital signatures saved in signature registers 1012 are not compared immediately (e.g., in real-time) but on a periodic basis. In an embodiment, the digital signatures are compared when a checkpoint operation is performed. In either of these embodiments, the digital signature mechanism described above is used to create a set of DIMM CRC registers (called memory block signature registers herein) that are placed in the memory controller's write buffers for each DIMM (including both address and data). The memory block signature registers can be read via an MSR read operation. The memory block signature registers are cleared on a power on reset or by an MSR write to a memory block signature register.

After a software lockstep checkpoint operation completes (e.g., the primary computer system has copied all the memory content writes to the secondary computer system), the memory block signature registers can be compared to ensure the contents of the memories are identical. Not all of the memories will be identical since the two systems are running different software, but the section of memory containing the backup memory image will be identical and have identical signatures.

For a system using a hardware mechanism as described above to copy the memory contents, the same mechanism is used to verify the contents are indeed still identical.

For a system not using a hardware mechanism as described above, the signature generator feature can be used to determine when a memory block needs to be copied during a checkpoint operation (to copy any changed memory from the primary computer system to the secondary computer system); the signature registers could be compared between the two systems to indicate which sections of memory have changed and need to be copied.

When a hardware lockstep miscompare occurs, the memory block signature registers can be interrogated and compared between the two computer systems that had been running in lockstep. If the memory block signature registers are the same, the memory contents are identical and no copy is required. If any registers are different, this indicates that there is a memory difference on the corresponding memory and that memory must be copied from the “good” memory contents to the “bad” memory contents to make the memory contents of the two computer systems identical. Once all the necessary copies are complete, the two computers' memory systems are again identical; the computer systems can then be taken back into hardware lockstep by a reset and resume from SMI.
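
The register comparison described above can be sketched as follows, for illustration only. The sketch assumes one running CRC-32 per memory block, accumulated over every write to that block; BLOCK_SIZE, NUM_BLOCKS, MemoryBlockSignatures, and blocks_to_copy are hypothetical names and values, and a real implementation would hold these registers in the memory controller rather than in software.

```python
import struct
import zlib

BLOCK_SIZE = 0x1000  # hypothetical block granularity (e.g., a partial DIMM)
NUM_BLOCKS = 4       # hypothetical number of memory block signature registers


class MemoryBlockSignatures:
    """One running CRC-32 per memory block, updated on every write to that block."""

    def __init__(self):
        self.registers = [0] * NUM_BLOCKS  # cleared state, as after a reset

    def record_write(self, address: int, data: bytes) -> None:
        block = address // BLOCK_SIZE
        payload = struct.pack("<Q", address) + data
        # zlib.crc32 accepts a starting value, so the CRC accumulates across writes.
        self.registers[block] = zlib.crc32(payload, self.registers[block])


def blocks_to_copy(good: MemoryBlockSignatures, bad: MemoryBlockSignatures) -> list:
    """Return the indices of blocks whose signature registers differ."""
    pairs = zip(good.registers, bad.registers)
    return [i for i, (g, b) in enumerate(pairs) if g != b]


system_a = MemoryBlockSignatures()
system_b = MemoryBlockSignatures()
for system in (system_a, system_b):
    system.record_write(0x0000, b"x")  # identical write to block 0 on both systems
system_a.record_write(0x2000, b"y")    # block 2 diverges
system_b.record_write(0x2000, b"z")

to_copy = blocks_to_copy(system_a, system_b)
```

In this example only block 2 is reported, so only that block would need to be copied from the “good” system to the “bad” system. Because the CRC in this sketch accumulates in write order, it assumes both systems issue their writes to a given block in the same order.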

An optimum number of memory block signature registers can be determined by how long it will take to perform the memory copy. Additional registers can be added to limit the amount of memory (partial DIMM) that needs to be copied, reducing the time the system is out of lock.

A system running in hardware lockstep could have a “slip” in the execution; for example, one system may have a stall for a correctable error correcting code (ECC) error while the other system would not need the correction. Both systems will be using identical data (since the error was corrected), but one system will have slipped its execution by one or more clocks since the data was delivered to the processor at a later time. Subsequent writes to the write buffer will have identical addresses and data but will be output at different times. This could (depending on the implementation) generate an out of lock indication; however, all memory block signature registers will have the same values (one write will be delayed but will occur), so no memory copy is needed, and letting one CPU execute a few more instructions to catch up to the other CPU will quickly put the system back into lockstep. Likewise, even if the systems are slightly out of lockstep, there can be a timing threshold where operations may continue for a brief time until the slower system makes a memory write. If the length of time is greater than a predetermined threshold, the systems may need to be stopped and re-aligned.
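
One way to picture the slip handling described above is the following sketch, which classifies a pair of signature events by comparing both the signatures and the times at which they were generated. It is an assumption-laden illustration: the threshold value, the unit of time (clocks), the function name classify_write_pair, and the four result labels are all hypothetical and are not taken from the description.

```python
SLIP_THRESHOLD_CLOCKS = 8  # hypothetical predetermined threshold, in clocks


def classify_write_pair(sig_a: int, time_a: int, sig_b: int, time_b: int,
                        threshold: int = SLIP_THRESHOLD_CLOCKS) -> str:
    """Classify one pair of corresponding writes from two lockstep systems.

    Returns:
      "mismatch" - signatures differ, so memory contents may differ
      "lockstep" - same signature generated at the same time
      "slip"     - same signature, time difference within the threshold
      "realign"  - same signature, but the slip exceeds the threshold
    """
    if sig_a != sig_b:
        return "mismatch"
    slip = abs(time_a - time_b)
    if slip == 0:
        return "lockstep"
    if slip <= threshold:
        return "slip"
    return "realign"


# Same write, delayed by three clocks on one system (e.g., after an ECC correction).
result = classify_write_pair(0xDEADBEEF, 100, 0xDEADBEEF, 103)
```

With the assumed threshold of eight clocks, result is "slip": the signatures agree, so no memory copy is needed and the slower system only has to catch up.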

One or more aspects of at least one example may be implemented by representative instructions stored on at least one tangible, non-transitory machine-readable medium which represents various logic within the processor, which when read by a machine, computing device or system causes the machine, computing device or system to fabricate logic to perform the techniques described herein. Such representations, known as “IP cores,” may be stored on a tangible, machine readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that actually make the logic or processor.

Various examples may be implemented using hardware elements, software elements, or a combination of both. In some examples, hardware elements may include devices, components, processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, ASIC, programmable logic devices (PLD), digital signal processors (DSP), FPGAs, AI cores, memory units, logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. In some examples, software elements may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an example is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given implementation.

Some examples may include an article of manufacture or at least one computer-readable medium. A computer-readable medium may include a non-transitory storage medium to store logic. In some examples, the non-transitory storage medium may include one or more types of computer-readable storage media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. In some examples, the logic may include various software elements, such as software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, API, instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof.

Some examples may be described using the expression “in one example” or “an example” along with their derivatives. These terms mean that a particular feature, structure, or characteristic described in connection with the example is included in at least one example. The appearances of the phrase “in one example” in various places in the specification are not necessarily all referring to the same example.

Included herein are logic flows or schemes representative of example methodologies for performing novel aspects of the disclosed architecture. While, for purposes of simplicity of explanation, the one or more methodologies shown herein are shown and described as a series of acts, those skilled in the art will understand and appreciate that the methodologies are not limited by the order of acts. Some acts may, in accordance therewith, occur in a different order and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all acts illustrated in a methodology may be required for a novel implementation.

A logic flow or scheme may be implemented in software, firmware, and/or hardware. In software and firmware embodiments, a logic flow or scheme may be implemented by computer executable instructions stored on at least one non-transitory computer readable medium or machine readable medium, such as an optical, magnetic or semiconductor storage. The embodiments are not limited in this context.

Some examples are described using the expression “coupled” and “connected” along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, descriptions using the terms “connected” and/or “coupled” may indicate that two or more elements are in direct physical or electrical contact with each other. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.

It is emphasized that the Abstract of the Disclosure is provided to comply with 37 C.F.R. Section 1.72(b), requiring an abstract that will allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing detailed description, it can be seen that various features are grouped together in a single example for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed examples require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed example. Thus, the following claims are hereby incorporated into the detailed description, with each claim standing on its own as a separate example. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein,” respectively. Moreover, the terms “first,” “second,” “third,” and so forth, are used merely as labels, and are not intended to impose numerical requirements on their objects.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

What is claimed is:
 1. A system comprising: a first computer system operating in lockstep with a second computer system; the first computer system including a first memory and a first central processing unit (CPU), a first memory controller, a first signature generator to generate a first digital signature representing a first write operation by the first memory controller to the first memory, the first write operation to store data at an address in the first memory, and a first signature register to store the first digital signature; and the second computer system including a second memory and a second CPU, a second memory controller, a second signature generator to generate a second digital signature representing a second write operation by the second memory controller to the second memory, the second write operation to store the data at the address in the second memory, and a second signature register to store the second digital signature; wherein the second CPU to compare the first digital signature to the second digital signature and to detect a lockstep error when the first digital signature does not match the second digital signature.
 2. The system of claim 1, wherein the first signature generator to generate a first hash value of at least the data and the address as the first digital signature and the second signature generator to generate a second hash value of at least the data and the address as the second digital signature.
 3. The system of claim 1, wherein the first signature generator to generate a first hash value of at least the data, the address, and a first time stamp as the first digital signature, and the second signature generator to generate a second hash value of at least the data, the address, and a second time stamp as the second digital signature.
 4. The system of claim 1, wherein the first signature generator to generate a first cyclic redundancy check (CRC) value of at least the data and the address as the first digital signature, and the second signature generator to generate a second CRC value of at least the data and the address as the second digital signature.
 5. The system of claim 1, wherein the first computer system includes the first memory and the first CPU and the first memory controller, and a first component including the first signature generator and the first signature register; and the second computer system includes the second memory and the second CPU and the second memory controller, and a second component including the second signature generator and the second signature register.
 6. The system of claim 1, comprising: a third component including one or more signature registers and a comparator, the comparator to compare at least two digital signatures of the one or more signature registers to detect the lockstep error when the at least two digital signatures do not match; wherein the first computer system includes the first memory and the first CPU, the first memory controller, and the first signature generator; and wherein the second computer system includes the second memory and the second CPU, the second memory controller, and the second signature generator.
 7. A method comprising: receiving, at a first memory controller of a first computer system including a first memory and a first central processing unit (CPU) and the first memory controller, a request to perform a first write operation by the first memory controller to the first memory, the first write operation to store data at an address in the first memory; generating, by a first signature generator of the first computer system, a first digital signature representing the first write operation; storing, by the first signature generator, the first digital signature in a first signature register of the first CPU; receiving, at a second memory controller of a second computer system including a second memory and a second CPU and the second memory controller, a request to perform a second write operation by the second memory controller to the second memory, the second write operation to store the data at the address in the second memory, the second computer system operating in lockstep with the first computer system; generating, by a second signature generator of the second computer system, a second digital signature representing the second write operation; storing, by the second signature generator, the second digital signature in a second signature register of the second CPU; and comparing the first digital signature to the second digital signature and detecting a lockstep error when the first digital signature does not match the second digital signature.
 8. The method of claim 7, wherein the first signature generator generates the first hash value of at least the data and the address as the first digital signature and the second signature generator generates the second hash value of at least the data and the address as the second digital signature.
 9. The method of claim 7, wherein the first signature generator generates the first hash value of at least the data, the address, and a first time stamp as the first digital signature, and the second signature generator generates the second hash value of at least the data, the address, and a second time stamp as the second digital signature.
 10. The method of claim 7, wherein the first signature generator generates a first cyclic redundancy check (CRC) value of at least the data and the address as the first digital signature, and the second signature generator generates a second CRC value of at least the data and the address as the second digital signature.
 11. The method of claim 7, wherein the first computing system performs the receiving, generating, and storing steps substantially in parallel with the second computing system performing the receiving, generating, and storing.
 12. The method of claim 11, comprising sending the first digital signature to the second CPU before comparing the first digital signature to the second digital signature.
 13. The method of claim 7, comprising restarting the first CPU and the second CPU at a last known valid location when the lockstep error is detected.
 14. At least one non-transitory machine-readable medium comprising a plurality of instructions that in response to being executed by a processor in a computer system cause the computer system to: receive, at a memory controller of the computer system including a memory and a CPU and the memory controller, a request to perform a write operation by the memory controller to the memory, the write operation to store the data at an address in the memory, the computer system operating in lockstep with a second computer system; generate, by a signature generator of the computer system, a digital signature representing the write operation; store, by the signature generator, the digital signature in a signature register of the CPU; receive a second digital signature representing a second write operation by a second memory controller to a second memory in the second computer system; and compare the digital signature to the second digital signature and detect a lockstep error when the digital signature does not match the second digital signature.
 15. The at least one non-transitory machine-readable medium of claim 14, comprising instructions, that when executed, generate a hash value of at least the data and the address as the digital signature.
 16. The at least one non-transitory machine-readable medium of claim 14, comprising instructions, that when executed, generate a hash value of at least the data, the address, and a time stamp as the digital signature.
 17. The at least one non-transitory machine-readable medium of claim 14, comprising instructions, that when executed, generate a cyclic redundancy check (CRC) value of at least the data and the address as the digital signature.
 18. The at least one non-transitory machine-readable medium of claim 14, comprising instructions, that when executed, restart the CPU at a last known valid location when the lockstep error is detected.